Sub-band basis spectrum model for pitch-synchronous log-spectrum and phase based on approximation of sparse coding

Authors

  • Masatsune Tamura
  • Takehiko Kagoshima
  • Masami Akamine
Abstract

In this paper, we propose a sub-band basis spectrum model, a new spectrum representation based on a linear combination of sub-band basis vectors. We apply sparse coding to pitch-synchronously analyzed log-spectra. By approximating the resulting basis, we obtain sub-band basis vectors with one-cycle sinusoidal shapes, placed on a mel scale at lower frequencies and on an equally spaced scale at higher frequencies. The parameters of the sub-band basis spectrum model, which represent the log spectrum and the phase spectrum, are calculated by fitting the basis to the spectrum. Since the parameters represent the shape of a spectrum, they can easily be used for voice adaptation, interpolation, and conversion. Experimental results show that analysis-synthesis speech based on the proposed model is close to the original speech, and that for unit-fusion based TTS [1] there is no significant difference between synthetic speech generated from the analysis-synthesis database and that generated from the original database.
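The fitting step described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the raised-cosine shape used for the "one-cycle sinusoidal" basis vectors, the 4 kHz split between mel-spaced and equally spaced centers, the numbers of basis vectors, the common mel formula, and the use of ordinary least squares for the fit. The function names (`center_frequencies`, `build_basis`, `fit_coefficients`) are hypothetical.

```python
import numpy as np

def hz_to_mel(f):
    # Common mel-scale formula (an assumption; the paper may use another).
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def center_frequencies(n_low, n_high, split_hz, nyquist_hz):
    # Mel-spaced centers from 0 Hz to split_hz, then equally spaced
    # centers above split_hz up to the Nyquist frequency.
    low = mel_to_hz(np.linspace(0.0, hz_to_mel(split_hz), n_low))
    high = np.linspace(split_hz, nyquist_hz, n_high + 1)[1:]
    return np.concatenate([low, high])

def build_basis(centers, n_bins, nyquist_hz):
    # Each column is a raised-cosine bump (assumed shape) that peaks at
    # its own center and falls to zero at the neighboring centers.
    freqs = np.linspace(0.0, nyquist_hz, n_bins)
    basis = np.zeros((n_bins, len(centers)))
    for k, c in enumerate(centers):
        if k > 0:  # rising half, from the previous center up to c
            lo = centers[k - 1]
            m = (freqs >= lo) & (freqs <= c)
            basis[m, k] = 0.5 - 0.5 * np.cos(np.pi * (freqs[m] - lo) / (c - lo))
        if k < len(centers) - 1:  # falling half, from c to the next center
            hi = centers[k + 1]
            m = (freqs >= c) & (freqs <= hi)
            basis[m, k] = 0.5 + 0.5 * np.cos(np.pi * (freqs[m] - c) / (hi - c))
    return basis

def fit_coefficients(basis, spectrum):
    # Ordinary least-squares fit of the basis to one spectrum
    # (an assumption; the paper's fitting procedure is not given here).
    coeffs, *_ = np.linalg.lstsq(basis, spectrum, rcond=None)
    return coeffs

# Usage on a synthetic log-spectrum (illustrative values only).
nyquist_hz = 8000.0
n_bins = 513
centers = center_frequencies(n_low=24, n_high=16, split_hz=4000.0,
                             nyquist_hz=nyquist_hz)
basis = build_basis(centers, n_bins, nyquist_hz)
freqs = np.linspace(0.0, nyquist_hz, n_bins)
log_spectrum = -freqs / 2000.0 + 0.5 * np.sin(2.0 * np.pi * freqs / 1500.0)
coeffs = fit_coefficients(basis, log_spectrum)
approx = basis @ coeffs
rms_error = float(np.sqrt(np.mean((approx - log_spectrum) ** 2)))
print(f"{len(coeffs)} coefficients, RMS fit error {rms_error:.4f}")
```

Because the model is linear in its coefficients, a weighted average of two coefficient vectors reconstructs the same weighted average of the two fitted spectra, which is one way to read the abstract's remark that the parameters are easy to interpolate.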

Similar resources

Application of spectrum-volume fractal modeling for detection of mineralized zones

The main goal of this research work was to detect the different Cu mineralized zones in the Sungun porphyry deposit in NW Iran using the Spectrum-Volume (S-V) fractal modeling based on the sub-surface data for this deposit. This operation was carried out on an estimated Cu block model based on a Fast Fourier Transformation (FFT) using the C++ and MATLAB programming. The S-V log-log plot was gene...

Pitch-synchronous Speech Coding Based on Timbre Vectors

A pitch-synchronous method and system for speech coding using timbre vectors is disclosed. On the encoder side, speech signal is segmented into pitch-synchronous frames without overlap, then converted into a pitch-synchronous amplitude spectrum using FFT. Using Laguerre functions, the amplitude spectrum is transformed into a timbre vector. Using vector quantization, each timbre vector is conver...

Traffic Scene Analysis using Hierarchical Sparse Topical Coding

Analyzing motion patterns in traffic videos can be exploited directly to generate high-level descriptions of the video contents. Such descriptions may further be employed in different traffic applications such as traffic phase detection and abnormal event detection. One of the most recent and successful unsupervised methods for complex traffic scene analysis is based on topic models. In this pa...

Effect of Nitric acid on Particle Morphology of the Nano-TiO2

Nano-sized titanium dioxide (TiO2) powder was prepared by a new wet chemical route from its precursor, titanium (IV) chloride (TiCl4), with isopropoxy alcohol in the presence of nitric acid under ambient conditions. The morphologies, phase compositions and components of the TiO2 nanoparticles were characterized by transmission electron ...

A new approach to modeling excitation in very low-rate speech coding

A new method for two-band approximation of excitation signals in an LPC model, to improve speech naturalness in very low rate coding, is proposed. Based on a simplified model of Multi-Band Excitation, the method accurately determines the degree of periodicity, using the concept of Instantaneous Frequency (IF) estimation in frequency domain. The harmonic structure in the spectrum of LPC residual,...

Publication date: 2010